Technical comments

This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code. You can also view the current state of the notebook as a well-formatted HTML document, if you click preview just above this document.

Try executing a chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Cmd+Shift+Enter. You can also run the next chunk of code via Cmd+Alt+`. (These commands are valid on Mac, might differ on Windows)

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Cmd+Shift+K to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed. If you wish to get a full-fledged HTML output (that includes the interactive plots inside the document, rather than in seperate viewer windows), choose Knit to HTML instead of the Preview Notebook button.

Loading libraries

In R (and many other programming language) reusable code written by others is distributed via libraries. In order to use those pieces of code, we first need to install the packages (this has been done previously for all users, as it takes 2+ hours to install all required packages), then we need to load them within our current session.

# Load libraries
library(magrittr) # Pipe %>% operation for clean coding
library(SingleCellExperiment) # Data container (https://bioconductor.org/packages/release/bioc/vignettes/SingleCellExperiment/inst/doc/intro.html)
library(pcaMethods) # Linear DimRed methods (PCA and extensions)
library(RDRToolbox) # Non-linear DimRed methods (LLE, Isomap)
library(ggbiplot)
library(plotly)
library(sparsepca)
library(scran) # scRNA-seq methods, By Aaron Lun, Cambridge researcher
library(tidyverse)
library(printr)
cat("Libraries successfully loaded\n")
Libraries successfully loaded

Reading and understanding the data

# Read the data
data_zeisel <- readRDS("zeisel.rds")
# Get basic information of the data
print(data_zeisel)
class: SingleCellExperiment 
dim: 19972 3005 
metadata(0):
assays(2): counts logcounts
rownames(19972): Tspan12 Tshz1 ... Gm21943_loc3 Gm20738_loc3
rowData names(10): feature_symbol is_feature_control ... total_counts
  log10_total_counts
colnames(3005): X1 X1.1 ... X9.58 X9.59
colData names(30): clust_id cell_type1 ... pct_counts_ERCC is_cell_control
reducedDimNames(0):
spikeNames(1): ERCC
# Gene statistics
print(as_tibble(rowData(data_zeisel), rownames="gene_id"))
# Cell statistics
print(as_tibble(colData(data_zeisel), rownames="cell_id"))

Reducing dimensionality of the data

Step 1 - Remove uninformative genes

The easiest way to reduce the amount of our data is to select only a subset of the recorded genes to analyse.

Before we remove anything, we want to know in the end, how much information we are removing from the data by the subsetting of genes. To do this, we should first know, how much information is in the data. An easy proxy of “information” that we’ll use today is the amount of Variance in the data.

# In order to compute sample-wise variance, we need to center the data 
# (Column-wise, so basically find the mean of all data and substract that from each data point to serve as the new origin)
center_colmeans <- function(X) {
    Xmean = colMeans(X)
    X - rep(Xmean, rep.int(nrow(X), ncol(X)))
}
# Get total variance as the trace of the covariance matrix (but to save computation, we can rotate elements in trace as Tr[X X^T] = Tr[X^T X], then realise that this is just = sum(colSums(Xc^2)) )
get_total_matrix_variance <- function(X){
  Xc <- center_colmeans(X) 
  sum(colSums(Xc^2))
}
# Write a convenience function that directly computes the variance in our SCE data
get_total_data_variance <- function(sce_data, logcount = TRUE){
  if (logcount){
    sce_data %>% 
      logcounts() %>%
      get_total_matrix_variance() ->
      out
  } else {
    sce_data %>% 
      counts() %>%
      get_total_matrix_variance() ->
      out
  }
  
  # Return out
  out
}
# Get total variance in the (log counts) data
total_logcounts_variance <- get_total_data_variance(data_zeisel)
sprintf(paste0(
  "The total variance in the logcounts data is ", 
  get_total_data_variance(data_zeisel)
  ))
[1] "The total variance in the logcounts data is 36490381.8458941"

We may then remove genes that have very low total counts in the data:

# We first just remove the genes that have to small total counts across the whole dataset.
min_total_count_per_gene = 25
data_zeisel %>% 
  counts() %>% 
  rowSums() -> 
  tmp_gene_total_counts
  
data_zeisel[
  tmp_gene_total_counts >= min_total_count_per_gene, 
] ->
  data_zeisel_lowcount
  
# Of course let's check how much information we still have:
sprintf(paste0(
  "After removing low count genes, we retain ", 
  format(
    get_total_data_variance(data_zeisel_lowcount) / total_logcounts_variance *100, 
    digits=3
  ),
  "%% of the variance." 
  ))
[1] "After removing low count genes, we retain 94.5% of the variance."
sprintf(paste0(
  "We now have ", nrow(data_zeisel_lowcount), " / ", nrow(data_zeisel), " genes remain"
  ))
[1] "We now have 15251 / 19972 genes remain"

Next we check from the remaining genes, which one on average have the most variation in fold-changes across the different cells. As our scientific goal currently is to find a way to characterise how cells differ from one-another, this is a reasonable thing to do.

how_many_genes_to_keep = 500
# Next we check from the remaining genes, which one on average have the most fold_changes
data_zeisel_lowcount %>% 
  logcounts() %>%  # logcounts assay
  rowVars -> 
  gene_count_variances # save row-wise variances
# Add back the names (unfortunately rowVars deletets them)
names(gene_count_variances) <- rownames(data_zeisel_lowcount)
 
# Do the subsetting based on top N variances (by sorted gene name)
data_zeisel_lowcount[
  names((gene_count_variances %>% sort(decreasing = TRUE))[1:how_many_genes_to_keep]),
  ] ->
  data_zeisel_topgenes
# Of course let's check how much information we still have:
sprintf(paste0(
  "After keeping only the top ", how_many_genes_to_keep, " genes, we retain ", 
  format(
    get_total_data_variance(data_zeisel_topgenes) / total_logcounts_variance *100, 
    digits=3
  ),
  "%% of the variance." 
  ))
[1] "After keeping only the top 500 genes, we retain 12.1% of the variance."
# # Let's take a look at our expression heatmap
# data_zeisel_topgenes %>% 
#   scater::plotHeatmap(
#     features = 1:dim(.)[1],
#     cluster_rows = FALSE,
#     cluster_cols = FALSE
# )

Now we may wish to do some more advanced dimensionality reduction, that takes into account not just properties of single genes, but also how they interact. Principal Component Analysis (PCA) is one such methods, it attempts to reduce the dimensionality of the data not just by getting rid of single genes, but by finding linear combinations of gene expression patterns that are informative, and keep most of the variance intact.

# Run PCA on the data
results_prcomp <- prcomp(
  data_zeisel_topgenes %>% logcounts() %>% t() # Tranpose is necessary because base R algorithms think in data frames, where samples are in rows, and features are in columns 
)
plot_pca_result(results_prcomp, interactive = FALSE)

Now what have we learned, and was this helpful? Not really, interactive mode can point out outliers maybe (although PCA is not very robust to them). And what does PC1 and PC2 really mean?

We can look at what PC1 is:

as.data.frame(tibble::enframe(results_prcomp$rotation[,i][order(abs(results_prcomp$rotation[,i]), decreasing = TRUE)], value="weight"))
             name        weight
1            Plp1  0.1512231585
2             Trf  0.1184318670
3             Mal  0.1119303119
4            Apod  0.0991299868
5             Mog  0.0988120846
6            Car2  0.0954679536
7             Mbp  0.0937816511
8          Atp1b1 -0.0927071110
9             Cnp  0.0926628445
10          Ugt8a  0.0908801969
11         Snap25 -0.0868271964
12           Rtn1 -0.0853471471
13         Snhg11 -0.0853076663
14           Mobp  0.0852174670
15          Stmn3 -0.0851872221
16           Meg3 -0.0850614524
17           Ermn  0.0847810312
18          Enpp2  0.0844830955
19           Chn1 -0.0829452467
20            Mag  0.0808153672
21           Syt1 -0.0805356154
22          Cryab  0.0772241962
23           Snca -0.0771291675
24           Hpca -0.0765301504
25          Gria2 -0.0764177965
26         Tspan2  0.0755472332
27             Qk  0.0751080097
28          Gpm6a -0.0750386341
29         Cldn11  0.0750383274
30         Ppp3ca -0.0746356622
31            Cd9  0.0745513224
32         Atp2b1 -0.0737975062
33          Prkcb -0.0733255014
34            Nsf -0.0732832192
35          Basp1 -0.0732304888
36          Stmn2 -0.0731004359
37         Tubb2a -0.0727677277
38         Atp1a3 -0.0717133727
39          Ndrg4 -0.0714872184
40          Sept4  0.0707675484
41            Gsn  0.0695018571
42          Ywhah -0.0683160904
43            Dbi  0.0682783109
44           Gatm  0.0682077558
45         Scn2a1 -0.0673519326
46          Cmtm5  0.0665534230
47           Eno2 -0.0664872905
48         Grin2b -0.0664800573
49           Npc2  0.0662747616
50           Grm5 -0.0660794655
51           Ttc3 -0.0658067338
52           Qdpr  0.0652170197
53          Olfm1 -0.0649159001
54          Csrp1  0.0648917366
55           Nrgn -0.0648546351
56           Scg5 -0.0648122473
57          Gria1 -0.0646826834
58           Scd2  0.0646723108
59           Aspa  0.0635589232
60         Dlgap1 -0.0633828874
61           Thy1 -0.0630788533
62          Vsnl1 -0.0630204690
63  3110035E14Rik -0.0628254239
64        Tmem88b  0.0627953368
65          Ptgds  0.0625596742
66            Cpe -0.0622721679
67          Celf4 -0.0621523198
68        Tspan13 -0.0620052479
69           Syn2 -0.0618216797
70          Map1b -0.0617969243
71            Cck -0.0615945720
72           Ncdn -0.0609322847
73          Calm2 -0.0609191362
74         Opalin  0.0607256572
75         Ppp3r1 -0.0604381559
76         Pdlim2  0.0602987901
77           Napb -0.0599431118
78          Pcsk2 -0.0598603917
79         Cyfip2 -0.0595898481
80          Gap43 -0.0595753460
81           Dnm1 -0.0592848563
82         Gabra1 -0.0586533665
83          Uchl1 -0.0585312001
84           Nsg2 -0.0583097011
85     D3Bwg0562e -0.0582882344
86         Camk2b -0.0582699060
87         Slc8a1 -0.0581316051
88          Rnf13  0.0578877620
89        Rasgrp1 -0.0573868949
90         Ppp3cb -0.0571065831
91           Pja2 -0.0566133130
92          Psat1  0.0564942606
93          Ywhag -0.0564470492
94           Arf3 -0.0564274022
95           Cd81  0.0563185759
96          Celf2 -0.0563176087
97         Mllt11 -0.0557743640
98            Syp -0.0554515145
99         Pgm2l1 -0.0553307240
100         Grb14  0.0550364565
101 2900097C17Rik -0.0548232666
102         Kif5c -0.0546580146
103       Zcchc18 -0.0546222514
104         Pde1a -0.0543013065
105        Cx3cl1 -0.0541693367
106      Atp6v1g2 -0.0540465925
107       Prkar1b -0.0535963529
108        Tuba4a -0.0533701293
109          Glul  0.0533471101
110        Camk2a -0.0532664560
111         Camkv -0.0531537313
112         Rab3a -0.0530295619
113        Gabrb2 -0.0529285559
114         Phgdh  0.0526309896
115        Rbfox1 -0.0525520726
116         Snurf -0.0525338267
117        Mapk10 -0.0525098527
118       Neurod6 -0.0524221383
119         Cers2  0.0523790131
120          Pllp  0.0523475520
121           Gda -0.0521827446
122          Nptn -0.0520983115
123         Myt1l -0.0520870534
124         Gpr37  0.0519681311
125         Sepp1  0.0514071017
126       Slc12a2  0.0507642173
127         Prdx1  0.0507496979
128         Cpne6 -0.0507028278
129         Epha4 -0.0506862929
130         Fbxw7 -0.0506319256
131        Dbndd2  0.0504558601
132          Crym -0.0502029415
133          Mdh1 -0.0494478898
134       Ngfrap1 -0.0494437311
135   Atp6v0c-ps2 -0.0494388685
136        Taldo1  0.0493722719
137         Wasf1 -0.0491817144
138       Fam131a -0.0489454328
139        Gabrg2 -0.0489247762
140         Ywhaz -0.0486943091
141         Ptprn -0.0486062773
142        Rbfox3 -0.0485890538
143         Calm3 -0.0485730845
144         Aldoa -0.0483798589
145         Gria3 -0.0483568553
146        Elovl7  0.0482904341
147         Synj1 -0.0481565343
148         Nrxn1 -0.0480623880
149          Ensa -0.0480525767
150        Sc4mol  0.0479714658
151         Dclk1 -0.0479622603
152         Celf5 -0.0478839029
153          Syt4 -0.0476483340
154        Brinp1 -0.0475993711
155       Slc12a5 -0.0475082966
156       Slc38a2  0.0474808931
157          Chgb -0.0474575605
158          Sv2b -0.0472659590
159        Elmod1 -0.0472202969
160         Ywhab -0.0468300004
161        Atp2b2 -0.0467308753
162 2810468N07Rik  0.0466799870
163         Prkce -0.0465873218
164        Ociad2 -0.0465089745
165         Rab6b -0.0465061201
166         Cplx2 -0.0463002141
167           Ntm -0.0461266208
168       Plekhb1  0.0460448384
169        Maged1 -0.0458650137
170          Chl1 -0.0458469367
171      Atp6v1b2 -0.0455643746
172         Litaf  0.0455442211
173          Scg2 -0.0455396461
174          Syn1 -0.0453830808
175         Gng11  0.0453439649
176          Caly -0.0451921630
177         Rab3c -0.0450917523
178         Nell2 -0.0449689558
179         Ptprs -0.0448969629
180        Cdk5r1 -0.0447913693
181           Ptn  0.0446952291
182         Kalrn -0.0444743174
183        Sptan1 -0.0441070509
184   Evi2a-evi2b  0.0440892611
185         Tenm2 -0.0440533499
186        Snap91 -0.0439981760
187          Got1 -0.0438925457
188         Scn8a -0.0438730190
189         Ctxn1 -0.0438502903
190        Hmgcs1  0.0437288126
191          Lmo4 -0.0435216091
192       Gprasp1 -0.0435103715
193         Ptk2b -0.0433335751
194      Serpini1 -0.0431802040
195         Gpr22 -0.0431435281
196         Ndrg3 -0.0431336154
197        Arpp21 -0.0430369869
198        Zbtb18 -0.0429954556
199         Ppm1e -0.0428969110
200          Pld3 -0.0428733492
201          Nrn1 -0.0428579352
202      Cacna2d1 -0.0428296996
203          Tpi1 -0.0427684050
204         Cadm2 -0.0427190994
205          Pfn2 -0.0427116546
206        Stxbp1 -0.0426794041
207          Enc1 -0.0425854793
208        Atp2a2 -0.0425496999
209        Sh3gl2 -0.0424509160
210         Kcnd2 -0.0424075365
211         Olig1  0.0423191030
212         Ndrg1  0.0422824238
213           Pkm -0.0422624106
214          Sypl  0.0422417048
215         Reep3  0.0422159513
216         Tubb3 -0.0421805676
217          Ftl1  0.0421018533
218 2010300C02Rik -0.0420940218
219          Ldha -0.0420630656
220         Cyp51  0.0420126899
221          Gnas -0.0419969767
222           Gls -0.0419664334
223          Plk2 -0.0419307678
224         Nrxn3 -0.0419161435
225          Rgs4 -0.0416645522
226        Frrs1l -0.0415617046
227         Fabp5  0.0415396141
228       Camk2n1 -0.0415024241
229          Dgkg -0.0414756049
230       Atp6v1d -0.0414710915
231          Gjc3  0.0414326130
232       Atp6v1a -0.0413493097
233         Pgam1 -0.0413116182
234          Jam3  0.0410812858
235          Mtpn -0.0410730458
236        Kcnma1 -0.0409681496
237        Phldb1  0.0409512632
238         Mapk1 -0.0409205846
239        Bcl11b -0.0406004566
240        Kifap3 -0.0405706584
241         Itm2c -0.0405534298
242          Ahi1 -0.0405162447
243        Cnksr2 -0.0404707021
244         Sirt2  0.0404195013
245        Baiap2 -0.0404171252
246         Car14  0.0403645546
247        Rgs7bp -0.0403299095
248        Pgrmc1 -0.0402630230
249        Camta1 -0.0401132555
250          Sub1 -0.0400926961
251           Mt1  0.0400243836
252           Nov -0.0400221818
253          Dkk3 -0.0397158577
254       Slc48a1  0.0394878227
255          Rnf7  0.0391880301
256         Hsph1 -0.0390343487
257         Clic4  0.0387717705
258          Cmip -0.0387368877
259           Abr -0.0387214901
260          Gng5  0.0387148112
261          Ptma  0.0382611287
262         Rab6a -0.0382475033
263        Snap47 -0.0381483674
264        Tagln3 -0.0378067597
265        Tmeff2  0.0376942810
266          Nbea -0.0376460950
267          Npcd -0.0375058877
268         Cntn1 -0.0374838201
269         Kcnb1 -0.0371608676
270          Ddr1  0.0368337167
271         Cpne7 -0.0366024950
272        Fermt2  0.0364664002
273         Pebp1 -0.0364405455
274          Tmx4 -0.0363428485
275         S100b  0.0361816271
276          Cd63  0.0360047246
277         Idh3b -0.0359545133
278        Atpif1 -0.0359526163
279          Nme1 -0.0358935290
280         Zwint -0.0358505514
281          Wfs1 -0.0358335168
282       Slc25a3 -0.0356908168
283         Adcy1 -0.0355791795
284      Atp6v0d1 -0.0354647792
285        Atp1a2  0.0353791611
286          Miat -0.0353416561
287         Lamp2  0.0352614213
288          Bpgm  0.0351809175
289         Strbp -0.0350888357
290        S100a1  0.0350692795
291           Jun  0.0349345780
292          Chd3 -0.0349193740
293         Sorl1 -0.0348763799
294          Hprt -0.0348128948
295          Pfkp -0.0347275499
296       Dpy19l1  0.0345897250
297          Purb -0.0345548391
298          Ctsl  0.0345245541
299         Actr2 -0.0342891750
300        Mycbp2 -0.0341720906
301         Gnao1 -0.0340223115
302        Eif4a2 -0.0338912026
303        Atp5g1 -0.0338571570
304         Foxp1 -0.0335223976
305         Atp5b -0.0333563369
306          Rtn3 -0.0331263310
307         Degs1  0.0331071477
308          Arf1 -0.0328521406
309   Gm5506_loc1 -0.0325597546
310          Fth1  0.0325498729
311           Fos  0.0322522713
312         Fkbp3 -0.0322392160
313         Cadm3 -0.0321086889
314 1810037I17Rik  0.0319324290
315         Gp1bb -0.0319053548
316        Myl12b -0.0315982804
317         Prdx6  0.0314306838
318        Tspan7 -0.0313931474
319      Slc22a17 -0.0313699302
320          Apoe  0.0313442928
321          Prnp -0.0312669689
322         G3bp2 -0.0309682491
323      Atp6v1e1 -0.0308307440
324         Cplx1 -0.0307952131
325        Sptbn1 -0.0306580123
326         Cadm1 -0.0305719494
327         Ntrk2 -0.0304816949
328          Tcf4 -0.0303330992
329       Tsc22d1 -0.0303083772
330        Dynll2 -0.0301432939
331        Atp1b3  0.0300243036
332         Gstp1  0.0299448994
333          Ryr2 -0.0298833966
334         Usmg5 -0.0298498387
335         Trim2 -0.0297801245
336         Sparc  0.0297119223
337       Cttnbp2 -0.0294919603
338          Cnr1 -0.0290362603
339          Aatk  0.0290271034
340         Ptprd  0.0290240917
341        Slc1a3  0.0288899416
342        Tubb4a  0.0285761394
343         Hspa8 -0.0284383833
344         Alcam -0.0284081818
345         Cntn2  0.0283935376
346          Atrx -0.0280128494
347         Tubb5 -0.0276575303
348        Tuba1b -0.0275957265
349         Kif5b -0.0274051474
350         Tmod2 -0.0273414149
351         Mgst3 -0.0271767650
352         Cox7b -0.0271746624
353         Pcdh9  0.0269979876
354          Pls3 -0.0263914902
355       Slc25a4 -0.0263852952
356         Timp2 -0.0262615271
357        Tmsb10 -0.0261928223
358         Nrip3 -0.0261564714
359          Nnat -0.0261490593
360 1500004A13Rik  0.0261098136
361        Ndufa5 -0.0257939939
362        Ndufa4 -0.0252504077
363          Vmp1  0.0248372348
364          Cltc -0.0245993720
365       Slc24a2 -0.0244807357
366         Gstm5  0.0244175284
367          Ank2 -0.0244003473
368         Efnb3  0.0242479827
369         Rps20  0.0233663548
370         H3f3a  0.0233204344
371          Rpl7  0.0230251838
372         Ghitm -0.0229452321
373          Sat1  0.0228016035
374        Gm9846  0.0227920528
375        Ppap2b  0.0226466272
376         Dusp1  0.0225981751
377          Rps3  0.0225231343
378          Nfib -0.0223714513
379        Nap1l5 -0.0223112226
380          Dner -0.0220989694
381        Anks1b -0.0220467127
382      Kcnq1ot1 -0.0219293649
383          Fosb  0.0216528176
384         Gnai2  0.0214823086
385         Anxa5  0.0213896693
386     Secisbp2l  0.0210364166
387        Fkbp1a -0.0209442042
388        Cacnb4 -0.0209335898
389         Rps29  0.0208236413
390         Zfp36  0.0207796164
391       Morf4l2 -0.0207389994
392          Deb1  0.0206252529
393          Rps6  0.0206136313
394    Rps15a-ps4  0.0201819675
395        Dynll1 -0.0200864586
396         Rps24  0.0200615914
397         Kif1a -0.0197486490
398         Mfge8  0.0192450801
399          Pcp4 -0.0191849820
400         Nfasc  0.0190847374
401          Klf4  0.0189028215
402       Gpr37l1  0.0188821047
403           Id3  0.0187836638
404     Rpl34-ps1  0.0185598920
405        Ndufa1 -0.0185193873
406          Ldhb -0.0181796944
407       Sparcl1 -0.0181717998
408        Gm6402  0.0180548472
409          Cycs -0.0179566183
410          Glo1  0.0173871313
411           Mt2  0.0170529641
412         Mef2c -0.0169067624
413           B2m  0.0168546882
414         Rpl39  0.0165978355
415         Gpm6b  0.0165464299
416         Gstm1  0.0164136387
417         Ywhaq  0.0159843975
418         Rps18  0.0159232472
419       Slco1c1  0.0158574434
420           Omg  0.0157975236
421          Gja1  0.0157616906
422         Atp5o -0.0157050934
423         Kcna1  0.0154814495
424          Xist  0.0154764626
425        Luc7l3 -0.0154628511
426        Marcks -0.0154566344
427        Pea15a  0.0153910229
428          Btg2  0.0152158196
429        Dpysl2 -0.0151938131
430        Rps3a1  0.0151286709
431       Gm15421 -0.0151181240
432       Zfp36l1  0.0150688055
433         Cyr61  0.0149714074
434        Nfkbia  0.0148377995
435         Atp5k -0.0147365161
436          Ly6e -0.0146758061
437         Rpl32  0.0144245661
438         Itm2a  0.0143783961
439        Pcdh17 -0.0142548806
440         Ly6c1  0.0141353087
441 2900060B14Rik -0.0140206980
442        Atp5j2 -0.0139887152
443        Pla2g7  0.0138932220
444           Ubc  0.0135401507
445         Stmn4  0.0133879068
446          Mmd2  0.0129128004
447          Aqp4  0.0127933883
448         Ndrg2  0.0126087113
449          Zeb2 -0.0126017782
450          Bcan  0.0124597973
451          Pltp  0.0123852409
452        Ndufc1 -0.0123198216
453      Sh3bgrl3 -0.0120811567
454       Slco1a4  0.0120673511
455          Gad1 -0.0119723430
456          Gad2 -0.0119687650
457      Pisd-ps3 -0.0113540619
458           Bsg -0.0111193500
459         Sumo2 -0.0110710704
460        Cox7a2 -0.0110414553
461        Slc4a4  0.0109365063
462        Srsf11 -0.0108817401
463    Rpl31-ps12  0.0106882957
464         Aldoc  0.0104227025
465           Mif -0.0102354949
466         Srrm2 -0.0101836074
467         Acot7 -0.0101666357
468          Scg3 -0.0098045808
469        Atp5f1 -0.0097006093
470 2010107E04Rik -0.0094175915
471         Htr3a -0.0093346158
472           Id2 -0.0092830864
473         Aplp1  0.0088738078
474        Cox6b1 -0.0087876541
475         Atp5l -0.0087052693
476           Clu -0.0084545823
477          Cst3  0.0078706415
478          Hes1  0.0072921032
479         Acta2  0.0068004040
480         Ncald -0.0065495233
481          Tecr  0.0061355291
482         Atp5j -0.0058849356
483           Npy -0.0057826019
484         Syt11  0.0055345134
485        Ndufb4 -0.0054895872
486        Prkacb -0.0051883092
487           Ckb  0.0050882670
488          Egr1 -0.0037537489
489         Lars2  0.0037112272
490           Vip -0.0035437100
491           Sst -0.0035297699
492         Uqcrh  0.0033672429
493        Slc1a2 -0.0029527647
494       Osbpl1a  0.0029010677
495        Atp1b2 -0.0024692890
496        Tmsb4x -0.0020913769
497      Serpine2  0.0020746148
498        Tuba1a  0.0020134896
499         Efr3b  0.0010979011
500        Slc6a1  0.0008816388

Still not terribly helpful, but this represents some linear combination of (log) gene expressions. It actually means an equation for a single line in the space of all possible expression patterns. For a hypotethical “cell_x”, that has expression x_ for every gene, this equation would look something like this:

\[ \textrm{PC1} ( \textrm{cell_x} ) = 0.15 * x_\textrm{Plp1} + 0.11 * x_\textrm{Trf} - 0.08 * x_\textrm{Snap25} + \dots + 0.0008 * x_\textrm{Slc6a1} \\ = \vec{w}_1 \cdot \vec{x}\]

Ok so this means that when some of those genes in the equation are upregulated together, and some others are down-regulated at the same time. That’s great, I can go test that?! No, not really, this equation involves ALL measured genes, so it is not particularly useful to create new hypothesis.

What can we do about that?

``` 
  First, let's look at what PCA actually does! 
  (Continued on the board) 

```

Now with our extra penalty term for using too many genes to explain variance, we can make a trade-off between how much variance our PC1 explains, versus how many genes the \(w1\) vector involves to compute that PC1.

# Compute statistics for each PC1:
results_spca2

results_spca2 <- results_spca %>%

results_spca_all <- lapply(
  spca_alpha_vals, 
  
)

results_prcomp <- prcomp(
  data_zeisel_topgenes %>% logcounts() %>% t() 
  )

results_prcomp_normalcounts <- prcomp(
  data_zeisel_topgenes %>% counts() %>% t()
)

# Plot log counts PCA plot
tmp_plot_left <- plot_pca_result(results_prcomp, interactive = FALSE) # , dataset = data_zeisel_topgenes, show_celltype = TRUE)

# Plot normal counts PCA plot
tmp_plot_right <- plot_pca_result(results_prcomp_normalcounts, interactive = FALSE)
gridExtra::grid.arrange(tmp_plot_left, tmp_plot_right, ncol=2)
LS0tCnRpdGxlOiAic2NSTkEtc2VxIGFuYWx5c2lzIHZpYSBEaW1lbnNpb25hbGl0eSByZWR1Y3Rpb24iCm91dHB1dDoKICBodG1sX25vdGVib29rOgogICAgZGZfcHJpbnQ6IHBhZ2VkCiAgaHRtbF9kb2N1bWVudDoKICAgIGRmX3ByaW50OiBwYWdlZAplZGl0b3Jfb3B0aW9uczoKICBjaHVua19vdXRwdXRfdHlwZTogaW5saW5lCi0tLQoKYGBge3IgZWNobz1GQUxTRX0KIyBTZXQgbGlicmFyaWVzIHBhdGgKLmxpYlBhdGhzKG5ldz0iL3Vzci9sb2NhbC9saWIvUi9zaXRlLWxpYnJhcnkvMy41IikKCiMgU2V0IGJhc2Uga25pdHIgb3B0aW9ucwpsaWJyYXJ5KGtuaXRyKQpvcHRzX2NodW5rJHNldCgKICBtZXNzYWdlPUZBTFNFLAogIHdhcm5pbmc9RkFMU0UsCiAgbWF4LnByaW50ID0gMjAsCiAgcm93bmFtZXMucHJpbnQgPVRSVUUKICApCgpgYGAKCgojIyMgVGVjaG5pY2FsIGNvbW1lbnRzCgpUaGlzIGlzIGFuIFtSIE1hcmtkb3duXShodHRwOi8vcm1hcmtkb3duLnJzdHVkaW8uY29tKSBOb3RlYm9vay4gV2hlbiB5b3UgZXhlY3V0ZSBjb2RlIHdpdGhpbiB0aGUgbm90ZWJvb2ssIHRoZSByZXN1bHRzIGFwcGVhciBiZW5lYXRoIHRoZSBjb2RlLiBZb3UgY2FuIGFsc28gdmlldyB0aGUgY3VycmVudCBzdGF0ZSBvZiB0aGUgbm90ZWJvb2sgYXMgYSB3ZWxsLWZvcm1hdHRlZCBIVE1MIGRvY3VtZW50LCBpZiB5b3UgY2xpY2sgcHJldmlldyBqdXN0IGFib3ZlIHRoaXMgZG9jdW1lbnQuCgpUcnkgZXhlY3V0aW5nIGEgY2h1bmsgYnkgY2xpY2tpbmcgdGhlICpSdW4qIGJ1dHRvbiB3aXRoaW4gdGhlIGNodW5rIG9yIGJ5IHBsYWNpbmcgeW91ciBjdXJzb3IgaW5zaWRlIGl0IGFuZCBwcmVzc2luZyAqQ21kK1NoaWZ0K0VudGVyKi4gWW91IGNhbiBhbHNvIHJ1biB0aGUgbmV4dCBjaHVuayBvZiBjb2RlIHZpYSAqQ21kK0FsdCtgKi4gKFRoZXNlIGNvbW1hbmRzIGFyZSB2YWxpZCBvbiBNYWMsIG1pZ2h0IGRpZmZlciBvbiBXaW5kb3dzKQoKV2hlbiB5b3Ugc2F2ZSB0aGUgbm90ZWJvb2ssIGFuIEhUTUwgZmlsZSBjb250YWluaW5nIHRoZSBjb2RlIGFuZCBvdXRwdXQgd2lsbCBiZSBzYXZlZCBhbG9uZ3NpZGUgaXQgKGNsaWNrIHRoZSAqUHJldmlldyogYnV0dG9uIG9yIHByZXNzICpDbWQrU2hpZnQrSyogdG8gcHJldmlldyB0aGUgSFRNTCBmaWxlKS4gCgpUaGUgcHJldmlldyBzaG93cyB5b3UgYSByZW5kZXJlZCBIVE1MIGNvcHkgb2YgdGhlIGNvbnRlbnRzIG9mIHRoZSBlZGl0b3IuICpQcmV2aWV3KiBkb2VzIG5vdCBydW4gYW55IFIgY29kZSBjaHVua3MuIEluc3RlYWQsIHRoZSBvdXRwdXQgb2YgdGhlIGNodW5rIHdoZW4gaXQgd2FzIGxhc3QgcnVuIGluIHRoZSBlZGl0b3IgaXMgZGlzcGxheWVkLiBJZiB5b3Ugd2lzaCB0byBnZXQgYSBmdWxsLWZsZWRnZWQgSFRNTCBvdXRwdXQgKHRoYXQgaW5jbHVkZXMgdGhlIGludGVyYWN0aXZlIHBsb3RzIGluc2lkZSB0aGUgZG9jdW1lbnQsIHJhdGhlciB0aGFuIGluIHNlcGVyYXRlIHZpZXdlciB3aW5kb3dzKSwgY2hvb3NlICpLbml0IHRvIEhUTUwqIGluc3RlYWQgb2YgdGhlICpQcmV2aWV3IE5vdGVib29rKiBidXR0b24uCgoKIyBMb2FkaW5nIGxpYnJhcmllcwoKSW4gUiAoYW5kIG1hbnkgb3RoZXIgcHJvZ3JhbW1pbmcgbGFuZ3VhZ2UpIHJldXNhYmxlIGNvZGUgd3JpdHRlbiBieSBvdGhlcnMgaXMgZGlzdHJpYnV0ZWQgdmlhIGxpYnJhcmllcy4gSW4gb3JkZXIgdG8gdXNlIHRob3NlIHBpZWNlcyBvZiBjb2RlLCB3ZSBmaXJzdCBuZWVkIHRvIGluc3RhbGwgdGhlIHBhY2thZ2VzICh0aGlzIGhhcyBiZWVuIGRvbmUgcHJldmlvdXNseSBmb3IgYWxsIHVzZXJzLCBhcyBpdCB0YWtlcyAyKyBob3VycyB0byBpbnN0YWxsIGFsbCByZXF1aXJlZCBwYWNrYWdlcyksIHRoZW4gd2UgbmVlZCB0byBsb2FkIHRoZW0gd2l0aGluIG91ciBjdXJyZW50IHNlc3Npb24uCgpgYGB7ciBsb2FkIGxpYnJhcmllcywgbWVzc2FnZT1GQUxTRSwgd2FybmluZz1GQUxTRX0KCiMgTG9hZCBsaWJyYXJpZXMKbGlicmFyeShtYWdyaXR0cikgIyBQaXBlICU+JSBvcGVyYXRpb24gZm9yIGNsZWFuIGNvZGluZwpsaWJyYXJ5KFNpbmdsZUNlbGxFeHBlcmltZW50KSAjIERhdGEgY29udGFpbmVyIChodHRwczovL2Jpb2NvbmR1Y3Rvci5vcmcvcGFja2FnZXMvcmVsZWFzZS9iaW9jL3ZpZ25ldHRlcy9TaW5nbGVDZWxsRXhwZXJpbWVudC9pbnN0L2RvYy9pbnRyby5odG1sKQpsaWJyYXJ5KHBjYU1ldGhvZHMpICMgTGluZWFyIERpbVJlZCBtZXRob2RzIChQQ0EgYW5kIGV4dGVuc2lvbnMpCmxpYnJhcnkoUkRSVG9vbGJveCkgIyBOb24tbGluZWFyIERpbVJlZCBtZXRob2RzIChMTEUsIElzb21hcCkKbGlicmFyeShnZ2JpcGxvdCkKbGlicmFyeShwbG90bHkpCgpsaWJyYXJ5KHNwYXJzZXBjYSkKCmxpYnJhcnkoc2NyYW4pICMgc2NSTkEtc2VxIG1ldGhvZHMsIEJ5IEFhcm9uIEx1biwgQ2FtYnJpZGdlIHJlc2VhcmNoZXIKCmxpYnJhcnkodGlkeXZlcnNlKQoKbGlicmFyeShwcmludHIpCgoKY2F0KCJMaWJyYXJpZXMgc3VjY2Vzc2Z1bGx5IGxvYWRlZFxuIikKCmBgYAoKIyBSZWFkaW5nIGFuZCB1bmRlcnN0YW5kaW5nIHRoZSBkYXRhCgoKCmBgYHtyfQojIFJlYWQgdGhlIGRhdGEKZGF0YV96ZWlzZWwgPC0gcmVhZFJEUygiemVpc2VsLnJkcyIpCgojIEdldCBiYXNpYyBpbmZvcm1hdGlvbiBvZiB0aGUgZGF0YQpwcmludChkYXRhX3plaXNlbCkKCiMgR2VuZSBzdGF0aXN0aWNzCnByaW50KGFzX3RpYmJsZShyb3dEYXRhKGRhdGFfemVpc2VsKSwgcm93bmFtZXM9ImdlbmVfaWQiKSkKCiMgQ2VsbCBzdGF0aXN0aWNzCnByaW50KGFzX3RpYmJsZShjb2xEYXRhKGRhdGFfemVpc2VsKSwgcm93bmFtZXM9ImNlbGxfaWQiKSkKYGBgCgoKIyBSZWR1Y2luZyBkaW1lbnNpb25hbGl0eSBvZiB0aGUgZGF0YQoKIyMjIyBTdGVwIDEgLSBSZW1vdmUgdW5pbmZvcm1hdGl2ZSBnZW5lcwoKVGhlIGVhc2llc3Qgd2F5IHRvIHJlZHVjZSB0aGUgYW1vdW50IG9mIG91ciBkYXRhIGlzIHRvIHNlbGVjdCBvbmx5IGEgc3Vic2V0IG9mIHRoZSByZWNvcmRlZCBnZW5lcyB0byBhbmFseXNlLgoKQmVmb3JlIHdlIHJlbW92ZSBhbnl0aGluZywgd2Ugd2FudCB0byBrbm93IGluIHRoZSBlbmQsIGhvdyBtdWNoIGluZm9ybWF0aW9uIHdlIGFyZSByZW1vdmluZyBmcm9tIHRoZSBkYXRhIGJ5IHRoZSBzdWJzZXR0aW5nIG9mIGdlbmVzLiBUbyBkbyB0aGlzLCB3ZSBzaG91bGQgZmlyc3Qga25vdywgaG93IG11Y2ggaW5mb3JtYXRpb24gaXMgaW4gdGhlIGRhdGEuIEFuIGVhc3kgcHJveHkgb2YgImluZm9ybWF0aW9uIiB0aGF0IHdlJ2xsIHVzZSB0b2RheSBpcyB0aGUgYW1vdW50IG9mIFZhcmlhbmNlIGluIHRoZSBkYXRhLgoKYGBge3J9CgojIEluIG9yZGVyIHRvIGNvbXB1dGUgc2FtcGxlLXdpc2UgdmFyaWFuY2UsIHdlIG5lZWQgdG8gY2VudGVyIHRoZSBkYXRhIAojIChDb2x1bW4td2lzZSwgc28gYmFzaWNhbGx5IGZpbmQgdGhlIG1lYW4gb2YgYWxsIGRhdGEgYW5kIHN1YnN0cmFjdCB0aGF0IGZyb20gZWFjaCBkYXRhIHBvaW50IHRvIHNlcnZlIGFzIHRoZSBuZXcgb3JpZ2luKQoKY2VudGVyX2NvbG1lYW5zIDwtIGZ1bmN0aW9uKFgpIHsKICAgIFhtZWFuID0gY29sTWVhbnMoWCkKICAgIFggLSByZXAoWG1lYW4sIHJlcC5pbnQobnJvdyhYKSwgbmNvbChYKSkpCn0KCiMgR2V0IHRvdGFsIHZhcmlhbmNlIGFzIHRoZSB0cmFjZSBvZiB0aGUgY292YXJpYW5jZSBtYXRyaXggKGJ1dCB0byBzYXZlIGNvbXB1dGF0aW9uLCB3ZSBjYW4gcm90YXRlIGVsZW1lbnRzIGluIHRyYWNlIGFzIFRyW1ggWF5UXSA9IFRyW1heVCBYXSwgdGhlbiByZWFsaXNlIHRoYXQgdGhpcyBpcyBqdXN0ID0gc3VtKGNvbFN1bXMoWGNeMikpID0gc3VtKFhjXjIpKQpnZXRfdG90YWxfbWF0cml4X3ZhcmlhbmNlIDwtIGZ1bmN0aW9uKFgpewogIFhjIDwtIGNlbnRlcl9jb2xtZWFucyhYKSAKICBzdW0oWGNeMikKfQoKIyBXcml0ZSBhIGNvbnZlbmllbmNlIGZ1bmN0aW9uIHRoYXQgZGlyZWN0bHkgY29tcHV0ZXMgdGhlIHZhcmlhbmNlIGluIG91ciBTQ0UgZGF0YQpnZXRfdG90YWxfZGF0YV92YXJpYW5jZSA8LSBmdW5jdGlvbihzY2VfZGF0YSwgbG9nY291bnQgPSBUUlVFKXsKICBpZiAobG9nY291bnQpewogICAgc2NlX2RhdGEgJT4lIAogICAgICBsb2djb3VudHMoKSAlPiUKICAgICAgZ2V0X3RvdGFsX21hdHJpeF92YXJpYW5jZSgpIC0+CiAgICAgIG91dAogIH0gZWxzZSB7CiAgICBzY2VfZGF0YSAlPiUgCiAgICAgIGNvdW50cygpICU+JQogICAgICBnZXRfdG90YWxfbWF0cml4X3ZhcmlhbmNlKCkgLT4KICAgICAgb3V0CiAgfQogIAogICMgUmV0dXJuIG91dAogIG91dAp9CgoKIyBHZXQgdG90YWwgdmFyaWFuY2UgaW4gdGhlIChsb2cgY291bnRzKSBkYXRhCnRvdGFsX2xvZ2NvdW50c192YXJpYW5jZSA8LSBnZXRfdG90YWxfZGF0YV92YXJpYW5jZShkYXRhX3plaXNlbCkKCnNwcmludGYocGFzdGUwKAogICJUaGUgdG90YWwgdmFyaWFuY2UgaW4gdGhlIGxvZ2NvdW50cyBkYXRhIGlzICIsIAogIGdldF90b3RhbF9kYXRhX3ZhcmlhbmNlKGRhdGFfemVpc2VsKQogICkpCgpgYGAKCldlIG1heSB0aGVuIHJlbW92ZSBnZW5lcyB0aGF0IGhhdmUgdmVyeSBsb3cgdG90YWwgY291bnRzIGluIHRoZSBkYXRhOgoKYGBge3J9CiMgV2UgZmlyc3QganVzdCByZW1vdmUgdGhlIGdlbmVzIHRoYXQgaGF2ZSB0byBzbWFsbCB0b3RhbCBjb3VudHMgYWNyb3NzIHRoZSB3aG9sZSBkYXRhc2V0LgptaW5fdG90YWxfY291bnRfcGVyX2dlbmUgPSAyNQpkYXRhX3plaXNlbCAlPiUgCiAgY291bnRzKCkgJT4lIAogIHJvd1N1bXMoKSAtPiAKICB0bXBfZ2VuZV90b3RhbF9jb3VudHMKICAKZGF0YV96ZWlzZWxbCiAgdG1wX2dlbmVfdG90YWxfY291bnRzID49IG1pbl90b3RhbF9jb3VudF9wZXJfZ2VuZSwgCl0gLT4KICBkYXRhX3plaXNlbF9sb3djb3VudAogIAoKIyBPZiBjb3Vyc2UgbGV0J3MgY2hlY2sgaG93IG11Y2ggaW5mb3JtYXRpb24gd2Ugc3RpbGwgaGF2ZToKc3ByaW50ZihwYXN0ZTAoCiAgIkFmdGVyIHJlbW92aW5nIGxvdyBjb3VudCBnZW5lcywgd2UgcmV0YWluICIsIAogIGZvcm1hdCgKICAgIGdldF90b3RhbF9kYXRhX3ZhcmlhbmNlKGRhdGFfemVpc2VsX2xvd2NvdW50KSAvIHRvdGFsX2xvZ2NvdW50c192YXJpYW5jZSAqMTAwLCAKICAgIGRpZ2l0cz0zCiAgKSwKICAiJSUgb2YgdGhlIHZhcmlhbmNlLiIgCiAgKSkKCnNwcmludGYocGFzdGUwKAogICJXZSBub3cgaGF2ZSAiLCBucm93KGRhdGFfemVpc2VsX2xvd2NvdW50KSwgIiAvICIsIG5yb3coZGF0YV96ZWlzZWwpLCAiIGdlbmVzIHJlbWFpbiIKICApKQoKYGBgCgpOZXh0IHdlIGNoZWNrIGZyb20gdGhlIHJlbWFpbmluZyBnZW5lcywgd2hpY2ggb25lIG9uIGF2ZXJhZ2UgaGF2ZSB0aGUgbW9zdCB2YXJpYXRpb24gaW4gZm9sZC1jaGFuZ2VzIGFjcm9zcyB0aGUgZGlmZmVyZW50IGNlbGxzLiBBcyBvdXIgc2NpZW50aWZpYyBnb2FsIGN1cnJlbnRseSBpcyB0byBmaW5kIGEgd2F5IHRvIGNoYXJhY3RlcmlzZSBob3cgY2VsbHMgZGlmZmVyIGZyb20gb25lLWFub3RoZXIsIHRoaXMgaXMgYSByZWFzb25hYmxlIHRoaW5nIHRvIGRvLgoKCmBgYHtyfQoKaG93X21hbnlfZ2VuZXNfdG9fa2VlcCA9IDUwMAoKIyBOZXh0IHdlIGNoZWNrIGZyb20gdGhlIHJlbWFpbmluZyBnZW5lcywgd2hpY2ggb25lIG9uIGF2ZXJhZ2UgaGF2ZSB0aGUgbW9zdCBmb2xkX2NoYW5nZXMKZGF0YV96ZWlzZWxfbG93Y291bnQgJT4lIAogIGxvZ2NvdW50cygpICU+JSAgIyBsb2djb3VudHMgYXNzYXkKICByb3dWYXJzIC0+IAogIGdlbmVfY291bnRfdmFyaWFuY2VzICMgc2F2ZSByb3ctd2lzZSB2YXJpYW5jZXMKCiMgQWRkIGJhY2sgdGhlIG5hbWVzICh1bmZvcnR1bmF0ZWx5IHJvd1ZhcnMgZGVsZXRldHMgdGhlbSkKbmFtZXMoZ2VuZV9jb3VudF92YXJpYW5jZXMpIDwtIHJvd25hbWVzKGRhdGFfemVpc2VsX2xvd2NvdW50KQogCiMgRG8gdGhlIHN1YnNldHRpbmcgYmFzZWQgb24gdG9wIE4gdmFyaWFuY2VzIChieSBzb3J0ZWQgZ2VuZSBuYW1lKQpkYXRhX3plaXNlbF9sb3djb3VudFsKICBuYW1lcygoZ2VuZV9jb3VudF92YXJpYW5jZXMgJT4lIHNvcnQoZGVjcmVhc2luZyA9IFRSVUUpKVsxOmhvd19tYW55X2dlbmVzX3RvX2tlZXBdKSwKICBdIC0+CiAgZGF0YV96ZWlzZWxfdG9wZ2VuZXMKCmRhdGFfemVpc2VsX3N1YnNldCA8LSBkYXRhX3plaXNlbF90b3BnZW5lcyAjIEp1c3QgZm9yIGNvbXBhdGliaWxpdHkgd2l0aCBoaXN0b3JpY2FsIGNvZGUuCgojIE9mIGNvdXJzZSBsZXQncyBjaGVjayBob3cgbXVjaCBpbmZvcm1hdGlvbiB3ZSBzdGlsbCBoYXZlOgpzcHJpbnRmKHBhc3RlMCgKICAiQWZ0ZXIga2VlcGluZyBvbmx5IHRoZSB0b3AgIiwgaG93X21hbnlfZ2VuZXNfdG9fa2VlcCwgIiBnZW5lcywgd2UgcmV0YWluICIsIAogIGZvcm1hdCgKICAgIGdldF90b3RhbF9kYXRhX3ZhcmlhbmNlKGRhdGFfemVpc2VsX3RvcGdlbmVzKSAvIHRvdGFsX2xvZ2NvdW50c192YXJpYW5jZSAqMTAwLCAKICAgIGRpZ2l0cz0zCiAgKSwKICAiJSUgb2YgdGhlIHZhcmlhbmNlLiIgCiAgKSkKCgpgYGAKCgpgYGB7ciBwY2FfcGxvdHRpbmdfZnVuY3Rpb24sIGVjaG89RkFMU0V9CgpwbG90X3BjYV9yZXN1bHQgPC0gZnVuY3Rpb24ocGNhX3Jlc3VsdCwgZGF0YXNldD1OVUxMLCBzaG93X2NlbGx0eXBlID0gRkFMU0UsIGludGVyYWN0aXZlPVRSVUUpewogICMgUGxvdCBQQ0EKICBpZiAoIXNob3dfY2VsbHR5cGUpewogICAgICBQMiA8LSBnZ2JpcGxvdChwY2FfcmVzdWx0LAogICAgICAgICAgICAgICAgICAgICBvYnMuc2NhbGUgPSAxLCAKICAgICAgICAgICAgICAgICAgICAgdmFyLnNjYWxlPTEsCiAgICAgICAgICAgICAgICAgICAgIGVsbGlwc2U9RiwKICAgICAgICAgICAgICAgICAgICAgY2lyY2xlPUYsCiAgICAgICAgICAgICAgICAgICAgIHZhcm5hbWUuc2l6ZT0wLjEsCiAgICAgICAgICAgICAgICAgICAgIHZhci5heGVzPUYsCiAgICAgICAgICAgICAgICAgICAgIGdyb3Vwcz1yZXAoMSwgdGltZXM9ZGltKHJlc3VsdHNfcHJjb21wJHgpWzFdKSwgCiAgICAgICAgICAgICAgICAgICAgIGFscGhhPTApICsgCiAgICAgICAgdGhlbWUobGVnZW5kLnBvc2l0aW9uPSJub25lIikKICAgICAgUDIkbGF5ZXJzIDwtIGMoZ2VvbV9wb2ludCgjYWVzKGNvbG9yPWRhdGFfemVpc2VsX3N1YnNldCRjZWxsX3R5cGUxKSwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBjZXg9MS4wLCBhbHBoYT0wLjMpLCBQMiRsYXllcnMpCiAgfSBlbHNlIHsKICAgIGlmIChpcy5udWxsKGRhdGFzZXQpKSBzdG9wKCJDYW5ub3Qgc2hvdyBjZWxsIHR5cGVzLCBubyB0aGUgXCJkYXRhc2V0XCIgaW5wdXQgaXMgbWlzc2luZy4gVHJ5IGFnYWluIHdpdGggc2hvd19jZWxsdHlwZT1GQUxTRSwgb3Igc3VwcGx5IHRoZSBkYXRhc2V0IHRoZSBQQ0EgYW5hbHlzaXMgd2FzIHJhbiBvbiIpCiAgICAKICAgIFAyIDwtIGdnYmlwbG90KHBjYV9yZXN1bHQsCiAgICAgICAgICAgICAgICAgICAgICAgb2JzLnNjYWxlID0gMSwgCiAgICAgICAgICAgICAgICAgICAgICAgdmFyLnNjYWxlPTEsCiAgICAgICAgICAgICAgICAgICAgICAgZWxsaXBzZT1GLAogICAgICAgICAgICAgICAgICAgICAgIGNpcmNsZT1GLAogICAgICAgICAgICAgICAgICAgICAgIHZhcm5hbWUuc2l6ZT0wLjEsCiAgICAgICAgICAgICAgICAgICAgICAgdmFyLmF4ZXM9RiwKICAgICAgICAgICAgICAgICAgICAgICBncm91cHM9ZGF0YXNldCRjZWxsX3R5cGUxLCAKICAgICAgICAgICAgICAgICAgICAgICBhbHBoYT0wKQogICAgUDIkbGF5ZXJzIDwtIGMoZ2VvbV9wb2ludChhZXMoY29sb3I9ZGF0YXNldCRjZWxsX3R5cGUxKSwgCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIGNleD0xLjAsIGFscGhhPTAuMyksIFAyJGxheWVycykKICB9CiAgCiAgIyBSZXR1cm4gdGhlIGdncGxvdF9vYmplY3QKICBpZiAoaW50ZXJhY3RpdmUpewogICAgUDJfb3V0IDwtIFAyCiAgICBQMl9vdXQkbGF5ZXJzW1syXV0kbWFwcGluZyR0ZXh0ID0gYWVzKHRleHQ9cm93bmFtZXMocGNhX3Jlc3VsdCR4KSkkdGV4dCAjIEFkZCBjZWxsIGlkIGFzIGhvdmVyIHRleHQgaW4gcGxvdGx5CiAgICBodG1sdG9vbHM6Omh0bWxfcHJpbnQoCiAgICAgICBQMl9vdXQgJT4lIGdncGxvdGx5KCksIAogICAgICAgdmlld2VyID0gZ2V0T3B0aW9uKCJicm93c2VyIiwgdXRpbHM6OmJyb3dzZVVSTCkKICAgICkKICB9CiAgCiAgUDIKfQoKYGBgCgoKYGBge3J9CgojICMgTGV0J3MgdGFrZSBhIGxvb2sgYXQgb3VyIGV4cHJlc3Npb24gaGVhdG1hcAojIGRhdGFfemVpc2VsX3RvcGdlbmVzICU+JSAKIyAgIHNjYXRlcjo6cGxvdEhlYXRtYXAoCiMgICAgIGZlYXR1cmVzID0gMTpkaW0oLilbMV0sCiMgICAgIGNsdXN0ZXJfcm93cyA9IEZBTFNFLAojICAgICBjbHVzdGVyX2NvbHMgPSBGQUxTRQojICkKCmBgYAoKTm93IHdlIG1heSB3aXNoIHRvIGRvIHNvbWUgbW9yZSBhZHZhbmNlZCBkaW1lbnNpb25hbGl0eSByZWR1Y3Rpb24sIHRoYXQgdGFrZXMgaW50byBhY2NvdW50IG5vdCBqdXN0IHByb3BlcnRpZXMgb2Ygc2luZ2xlIGdlbmVzLCBidXQgYWxzbyBob3cgdGhleSBpbnRlcmFjdC4gUHJpbmNpcGFsIENvbXBvbmVudCBBbmFseXNpcyAoUENBKSBpcyBvbmUgc3VjaCBtZXRob2RzLCBpdCBhdHRlbXB0cyB0byByZWR1Y2UgdGhlIGRpbWVuc2lvbmFsaXR5IG9mIHRoZSBkYXRhIG5vdCBqdXN0IGJ5IGdldHRpbmcgcmlkIG9mIHNpbmdsZSBnZW5lcywgYnV0IGJ5IGZpbmRpbmcgbGluZWFyIGNvbWJpbmF0aW9ucyBvZiBnZW5lIGV4cHJlc3Npb24gcGF0dGVybnMgdGhhdCBhcmUgaW5mb3JtYXRpdmUsIGFuZCBrZWVwIG1vc3Qgb2YgdGhlIHZhcmlhbmNlIGludGFjdC4KCmBgYHtyfQojIFJ1biBQQ0Egb24gdGhlIGRhdGEKcmVzdWx0c19wcmNvbXAgPC0gcHJjb21wKAogIGRhdGFfemVpc2VsX3RvcGdlbmVzICU+JSBsb2djb3VudHMoKSAlPiUgdCgpICMgVHJhbnBvc2UgaXMgbmVjZXNzYXJ5IGJlY2F1c2UgYmFzZSBSIGFsZ29yaXRobXMgdGhpbmsgaW4gZGF0YSBmcmFtZXMsIHdoZXJlIHNhbXBsZXMgYXJlIGluIHJvd3MsIGFuZCBmZWF0dXJlcyBhcmUgaW4gY29sdW1ucyAKKQoKCnBsb3RfcGNhX3Jlc3VsdChyZXN1bHRzX3ByY29tcCwgaW50ZXJhY3RpdmUgPSBGQUxTRSkKCmBgYAoKCk5vdyB3aGF0IGhhdmUgd2UgbGVhcm5lZCwgYW5kIHdhcyB0aGlzIGhlbHBmdWw/IE5vdCByZWFsbHksIGludGVyYWN0aXZlIG1vZGUgY2FuIHBvaW50IG91dCBvdXRsaWVycyBtYXliZSAoYWx0aG91Z2ggUENBIGlzIG5vdCB2ZXJ5IHJvYnVzdCB0byB0aGVtKS4gQW5kIHdoYXQgZG9lcyBQQzEgYW5kIFBDMiByZWFsbHkgbWVhbj8KCldlIGNhbiBsb29rIGF0IHdoYXQgUEMxIGlzOgoKYGBge3J9CmkgPSAxCmFzLmRhdGEuZnJhbWUodGliYmxlOjplbmZyYW1lKHJlc3VsdHNfcHJjb21wJHJvdGF0aW9uWyxpXVtvcmRlcihhYnMocmVzdWx0c19wcmNvbXAkcm90YXRpb25bLGldKSwgZGVjcmVhc2luZyA9IFRSVUUpXSwgdmFsdWU9IndlaWdodCIpKQpgYGAKClN0aWxsIG5vdCB0ZXJyaWJseSBoZWxwZnVsLCBidXQgdGhpcyByZXByZXNlbnRzIHNvbWUgbGluZWFyIGNvbWJpbmF0aW9uIG9mIChsb2cpIGdlbmUgZXhwcmVzc2lvbnMuIEl0IGFjdHVhbGx5IG1lYW5zIGFuIGVxdWF0aW9uIGZvciBhIHNpbmdsZSBsaW5lIGluIHRoZSBzcGFjZSBvZiBhbGwgcG9zc2libGUgZXhwcmVzc2lvbiBwYXR0ZXJucy4gRm9yIGEgaHlwb3RldGhpY2FsICJjZWxsX3giLCB0aGF0IGhhcyBleHByZXNzaW9uIHhfPGdlbmVfbmFtZT4gZm9yIGV2ZXJ5IGdlbmUsIHRoaXMgZXF1YXRpb24gd291bGQgbG9vayBzb21ldGhpbmcgbGlrZSB0aGlzOgoKCiQkIFx0ZXh0cm17UEMxfSAoIFx0ZXh0cm17Y2VsbF94fSApID0gMC4xNSAqIHhfXHRleHRybXtQbHAxfSArICAwLjExICogeF9cdGV4dHJte1RyZn0gLSAwLjA4ICogeF9cdGV4dHJte1NuYXAyNX0gKyBcZG90cyArIDAuMDAwOCAqIHhfXHRleHRybXtTbGM2YTF9IFxcID0gXHZlY3t3fV8xIFxjZG90IFx2ZWN7eH0kJAoKT2sgc28gdGhpcyBtZWFucyB0aGF0IHdoZW4gc29tZSBvZiB0aG9zZSBnZW5lcyBpbiB0aGUgZXF1YXRpb24gYXJlIHVwcmVndWxhdGVkIHRvZ2V0aGVyLCBhbmQgc29tZSBvdGhlcnMgYXJlIGRvd24tcmVndWxhdGVkIGF0IHRoZSBzYW1lIHRpbWUuIFRoYXQncyBncmVhdCwgSSBjYW4gZ28gdGVzdCB0aGF0PyEgTm8sIG5vdCByZWFsbHksIHRoaXMgZXF1YXRpb24gaW52b2x2ZXMgQUxMIG1lYXN1cmVkIGdlbmVzLCBzbyBpdCBpcyBub3QgcGFydGljdWxhcmx5IHVzZWZ1bCB0byBjcmVhdGUgbmV3IGh5cG90aGVzaXMuCgpXaGF0IGNhbiB3ZSBkbyBhYm91dCB0aGF0PwoKICAgIGBgYCAKICAgICAgRmlyc3QsIGxldCdzIGxvb2sgYXQgd2hhdCBQQ0EgYWN0dWFsbHkgZG9lcyEgCiAgICAgIChDb250aW51ZWQgb24gdGhlIGJvYXJkKSAKICAgIAogICAgYGBgCgpOb3cgd2l0aCBvdXIgZXh0cmEgcGVuYWx0eSB0ZXJtIGZvciB1c2luZyB0b28gbWFueSBnZW5lcyB0byBleHBsYWluIHZhcmlhbmNlLCB3ZSBjYW4gbWFrZSBhIHRyYWRlLW9mZiBiZXR3ZWVuIGhvdyBtdWNoIHZhcmlhbmNlIG91ciBQQzEgZXhwbGFpbnMsIHZlcnN1cyBob3cgbWFueSBnZW5lcyB0aGUgJHcxJCB2ZWN0b3IgaW52b2x2ZXMgdG8gY29tcHV0ZSB0aGF0IFBDMS4KCmBgYHtyfQoKIyBSdW4gU1BDQSB3aXRoIHZhcmlvdXMgYWxwaGEgcmVndWxhcmlzZXJzIHRvIGluY3JlYXNlIHNwYXJzaXR5IChvbmx5IDEwMCBnZW5lcyB0byBtYWtlIGl0IGZhc3QpCnJlc3VsdHNfc3BjYSA9IHRpYmJsZShzcGNhX2FscGhhX3ZhbHMgPSBjKDFlLTQsIDFlLTMsIDVlLTMsIDFlLTIsIDVlLTIpKQoKIyBEbyB0aGUgZml0cyBmb3IgZWFjaCBhbHBoYSB2YWx1ZQpyZXN1bHRzX3NwY2EgPC0gcmVzdWx0c19zcGNhICU+JQogIG11dGF0ZSgKICAgIHNwY2FfZml0ID0gcHVycnI6Om1hcCgKICAgICAgc3BjYV9hbHBoYV92YWxzLCAKICAgICAgZnVuY3Rpb24oYSkgc3BhcnNlcGNhOjpzcGNhKAogICAgICAgIGRhdGFfemVpc2VsX3RvcGdlbmVzWzE6MTAwLF0gJT4lIGxvZ2NvdW50cygpICU+JSB0KCksCiAgICAgICAgYWxwaGEgPSBhCiAgICAgICkKICAgICAgKQogICkKCiMgQ29tcHV0ZSBzdGF0aXN0aWNzIGZvciBlYWNoIFBDMToKcmVzdWx0c19zcGNhICU+JQogIG11dGF0ZSgKICAgIGdlbmVzX3VzZWQgPSAgcHVycnI6Om1hcCgKICAgICAgc3BjYV9maXQsCiAgICAgIGZ1bmN0aW9uKHgpewogICAgICAgIHN1bSh4JGxvYWRpbmdzWyxpXT4xZS02KQogICAgICB9KSwKICAgIHZhcl9leHBsYWluZWRfcGMxID0gcHVycnI6Om1hcCgKICAgICAgc3BjYV9maXQsCiAgICAgIGZ1bmN0aW9uKHgpewogICAgICAgIHN1bSh4JHNkZXZbWzFdXV4yKQogICAgICB9KQogICkgJT4lCiAgdW5uZXN0KGdlbmVzX3VzZWQsIHZhcl9leHBsYWluZWRfcGMxKQoKCgpgYGAKCmBgYHtyfQoKcmVzdWx0c19zcGNhX2FsbCA8LSBsYXBwbHkoCiAgc3BjYV9hbHBoYV92YWxzLCAKICAKKQoKcmVzdWx0c19wcmNvbXAgPC0gcHJjb21wKAogIGRhdGFfemVpc2VsX3RvcGdlbmVzICU+JSBsb2djb3VudHMoKSAlPiUgdCgpIAogICkKCnJlc3VsdHNfcHJjb21wX25vcm1hbGNvdW50cyA8LSBwcmNvbXAoCiAgZGF0YV96ZWlzZWxfdG9wZ2VuZXMgJT4lIGNvdW50cygpICU+JSB0KCkKKQoKIyBQbG90IGxvZyBjb3VudHMgUENBIHBsb3QKdG1wX3Bsb3RfbGVmdCA8LSBwbG90X3BjYV9yZXN1bHQocmVzdWx0c19wcmNvbXAsIGludGVyYWN0aXZlID0gRkFMU0UpICMgLCBkYXRhc2V0ID0gZGF0YV96ZWlzZWxfdG9wZ2VuZXMsIHNob3dfY2VsbHR5cGUgPSBUUlVFKQoKIyBQbG90IG5vcm1hbCBjb3VudHMgUENBIHBsb3QKdG1wX3Bsb3RfcmlnaHQgPC0gcGxvdF9wY2FfcmVzdWx0KHJlc3VsdHNfcHJjb21wX25vcm1hbGNvdW50cywgaW50ZXJhY3RpdmUgPSBGQUxTRSkKZ3JpZEV4dHJhOjpncmlkLmFycmFuZ2UodG1wX3Bsb3RfbGVmdCwgdG1wX3Bsb3RfcmlnaHQsIG5jb2w9MikKYGBgCgo=